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Abstract 

We  report  our  experiences  on  using  a  new  variant  of  the  Schwarz  preconditioned  GM- 
RES  methods  in  the  implicit  solution  of  the  unsteady  compressible  Navier-Stokes  equations 
discretized  on  two-dimensional  unstructured  meshes.  We  first  partition  the  global  mesh 
with  the  recursive  spectral  bisection  method  into  submeshes,  and  then  we  introduce  a  fam¬ 
ily  of  Schwarz  methods,  referred  to  as  the  Variable  Degree  Schwarz  methods  (VDS)  on  the 
overlapping  submeshes.  In  VDS,  the  subdomain  preconditioner  is  constructed  by  using  a 
polynomial  in  two  matrix  variables,  namely  the  matrix,  in  its  un-factorized  form,  of  the 
current  time  step  k  and  another  matrix,  in  its  factorized  form,  obtained  at  a  previous  time 
step  j.  The  degree  of  the  matrix  polynomial  in  each  subdomain  is  determined  automati¬ 
cally  so  that  extra  preconditioning  is  performed  only  in  subdomains  whose  associated  local 
matrices  have  large  condition  numbers.  The  extra  preconditioning  occurs  often  near  the 
body  of  the  airfoil.  We  show  numerically  that  VDS  is  very  effective.  Unlike  the  weU-known 
elliptic  theory,  we  observe  that  the  convergence  rate  of  VDS  preconditioned  GMRES  de¬ 
generates  very  mildly  without  a  coarse  space  for  reasonably  large  number  of  subdomains. 
We  also  study  the  effects  of  the  overlapping  size,  the  number  of  subdomains  and  the  level  of 
inexactness  of  the  subdomain  solvers.  The  other  purpose  of  the  study  is  to  understand  the 
robustness  of  the  Schwarz  methods  with  respect  to  flow  parameters,  such  as  the  CFL,  the 
free  stream  Mach  number  and  the  Reynolds  number.  Numerical  results  for  both  subsonic 
and  transonic  problems  are  reported. 
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1.  Introduction.  The  system  of  unsteady  compressible  Navier-Stokes  (N.-S.)  equa¬ 
tions  is  a  fundamental  system  in  fluid  dynamics.  To  be  able  to  solve  the  system  quickly 
and  accurately  in  complex  geometry  is  one  of  the  ultimate  goals  of  computational  fluid 
dynamics  [20].  To  achieve  the  goal,  several  important  techniques  have  been  developed  in 
the  past  few  years,  such  as  unstructured  grid  generations  for  complex  geometry,  stable, 
conservative  discretizations  of  the  N.-S.  equations,  unstructured  grid  partitionings,  as  well 
as  the  powerful,  robust  implicit  preconditioned  iterative  solvers  discussed  below.  Most  of 
the  techniques  are  developed  for  steady  state  calculations,  see  e.g.,  [1,  5,  16]  and  references 
therein.  In  this  paper,  we  put  all  the  techniques  together  and  introduce  a  robust  domain 
decomposition  based  fast  preconditioned  iterative  method  for  the  time  accurate  solution  of 
unsteady  problems. 

We  study  implicit  methods  for  solving  unsteady  N.-S.  equations  discretized  on  two- 
dimensional  unstructured  meshes  with  a  combined  finite  element/finite  volume  scheme  for 
the  spatial  variables  and  a  simple  backward  Euler  scheme  for  the  temporal  variable.  It 
is  well-known  that  the  main  advantage  of  implicit  methods  is  that  they  allow  the  time 
steps  to  be  determined  solely  based  on  the  physics  of  the  fluid  flow,  not  on  the  stability 
property  of  the  time  discretization  scheme,  [4,  30,  31].  However,  to  advance  in  time,  a 
large,  sparse,  nonsymmetric  linear  system  of  equations  must  be  constructed  and  solved  at 
each  time  step.  Depending  of  the  size  of  the  time  step,  and  several  other  flow  parameters, 
the  conditioning  of  the  matrix  may  change  from  well-conditioned  to  mildly  ill-conditioned. 
And  due  to  the  complexity  of  the  flow  pattern,  at  a  given  time  step,  the  matrix  may  be  ill- 
conditioned  in  certain  subregions  near  the  airfoil  and  relatively  well-conditioned  elsewhere. 
Details  about  the  local  conditioning  of  the  matrix  wiU  be  discussed  later.  To  solve  these 
systems  iteratively,  it  is  necessary  to  have  a  family  of  preconditioners  whose  strength  can  be 
adjusted  locally  in  each  subdomain  according  to  the  flow  condition.  Overlapping  Schwarz 
methods  (OSM)  is  a  family  of  preconditioners  for  solving  large  sparse  hnear  systems  arising 
from  the  discretization  of  partial  differential  equations,  see  e.g.  [6,  8,  11,  27].  It  was 
originally  designed  for  scalar  linear  elliptic  problems.  We  find  OSM  to  be  very  attractive 
for  our  purpose  because  (1)  they  are  more  paraUelizable  than  the  popularly  used  global  ILU 
preconditioners;  (2)  they  are  efficient  for  nonsymmetric  and  indefinite  problems;  (3)  they 
have  mesh  independent  convergence  rates,  at  least  for  elliptic  finite  element  problems;  (4) 
they  have  adjustable  strength  controlled  by  using  the  inexact  solution  techniques  for  solving 
local  problems.  We  shall  further  explore  the  flexibility  of  OSM  at  the  subdomain  level  and 
introduce  a  new  variant  below. 

It  is  well-known  that  when  constructing  a  preconditioner  for  solving  a  single  system  of 
linear  equations,  Au  =  /,  all  the  information  needs  to  be  from  the  matrix  A.  However, 
the  issue  for  time  dependent  problems  is  different.  A  sequence  of  inter- related  systems 
A^’^^u  =  has  to  be  solved.  If  the  matrix,  and  especially  in  its  (often  inexactly)  factorized 
form,  obtained  at  a  previous  time  step  can  be  properly  used,  then  the  preconditioner  at 
the  current  time  step  can  be  obtained  cheaply.  More  precisely  speaking,  at  each  time 
step,  we  solve  the  global  linear  system  by  a  preconditioned  GMRES  method  ([26])  and  in 
the  preconditioning  stage,  following  the  general  overlapping  Schwarz  framework,  we  solve 
the  local  subdomain  problems  by  another  preconditioned  GMRES  method,  with  different 
preconditioners  and  stopping  conditions.  In  each  subdomain  the  preconditioner  is  built  by 
using  a  polynomial  in  two  matrix  variables,  namely  the  matrix,  in  its  un-factorized  form,  of 
the  current  time  step  k  and  another  matrix,  in  its  factorized  form,  obtained  at  a  previous 
time  step  j.  The  degree  of  the  matrix  polynomial  reflects  the  conditioning  of  the  subdomain 
matrix.  Note  that  classical  Schwarz  methods  correspond  to  the  case  where  the  degree  of 
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the  matrix  polynomials  always  equals  to  one.  In  our  method,  the  degree  of  the  polynomial 
varies  from  subdomain  to  subdomain  depending  the  flow  conditions,  and  therefore  we  refer 
to  the  methods  as  variable  degree  Schwarz  methods  (VDS). 

In  this  paper,  we  also  investigate  the  difference  between  the  Schwarz  family  of  pre¬ 
conditioners  and  other  methods  such  as  the  simple  Jacobi  iterative  method  and  the  global 
ILU  preconditioned  iterative  method.  Within  the  Schwarz  preconditioners,  we  try  to  un¬ 
derstand  the  role  of  the  overlapping  size  between  subdomains,  the  effect  of  the  number  of 
subdomains,  and  the  effect  of  the  inexact  subdomain  solvers.  Since  the  construction  of 
the  preconditioner  is  expensive,  we  explore  the  possibility  of  re-using  the  preconditioner  for 
several  time  steps. 

We  restrict  our  attention  to  sequential  computers,  and  single  level  Schwarz  algorithms. 
For  steady  state  problems,  some  studies  can  be  found  in  [18].  For  other  recent  developments 
in  unsteady  calculations,  we  refer  the  readers  to  [2,  30,  31).  Our  focus  is  on  the  Schwarz 
algorithms,  therefore  only  the  simplest  time  discretization,  namely  the  first  order  backward 
Euler  scheme,  is  considered  in  this  paper.  Higher  order  schemes  and  their  influence  on  the 
Schwarz  preconditioners  will  be  discussed  in  a  forthcoming  paper.  The  physical  model  we 
choose  to  test  our  algorithms  contains  a  single  element  NACA0012  airfoil  at  a  rather  large 
angle  of  attack  with  a  modest  Reynolds  number.  Both  subsonic  and  transonic  cases  are 
studied  in  the  paper.  The  paper  is  organized  as  follows.  In  §2,  we  discuss  the  unsteady 
compressible  N.-S.  equations  in  the  conservative  form,  the  boundary  conditions  and  a  dis¬ 
cretization  scheme.  In  §3,  we  study  a  preconditioned  iterative  method  and  introduce  a 
new  variant  of  the  overlapping  Schwarz  preconditioners.  Numerical  experiments  for  both 
subsonic  and  transonic  flows  are  reported  in  §4.  §5  includes  a  few  final  remarks. 


2.  The  two-dimensional  unsteady  N.-S.  equations.  In  this  section,  we  describe 
the  two-dimensional  unsteady  compressible  N.-S.  equations  in  its  conservative  form.  We  also 
discuss  the  spatial  and  temporal  discretizations  of  the  equations  on  unstructured  meshes. 
Following  the  notions  of  Farhat,  Fezoui  and  Lanteri  [12,  13],  and  Fezoui  and  Stoufflet  [15], 
for  the  spatial  variables,  we  use  a  combined  finite  element /finite  volume  scheme  and  for  the 
temporal  variable  we  use  a  simple  backward  Euler  method.  The  scheme  is  of  second  order 
in  space  and  first  order  in  time. 


2.1.  Governing  equations.  Let  C  be  the  flow  domain  and  F  its  boundary,  as 
shown  in  Fig.  1.  The  conservative  form  of  the  N.-S.  equations  is  given  by 
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Here  the  functions  Fi  and  F2  denote  the  convective  fluxes 
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and  the  functions  Ri  and  R2  denote  the  diflFusive  fluxes 
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In  the  above  expressions,  p  is  the  density,  U  =  (u^v)^  is  the  velocity  vector,  E  is  the  total 
energy  per  unit  of  volume,  and  p  is  the  pressure.  These  variables  are  related  by  the  state 
equation  for  perfect  gas 

P=(7-1)  (E-\p\\Ufy 


where  7  denotes  the  ratio  of  specific  heats  (7  =  1.4,  for  air).  The  specific  internal  energy  e 
is  related  to  the  temperature  via 


In  the  diffusive  fluxes,  t^x,  r^y,  and  Tyy  are  the  components  of  the  two-dimensional  Cauchy 
stress  tensor,  k  is  the  normalized  thermal  conductivity,  Pr  =  poCp/ko  is  the  Prandtl  number 
(Pr  =  0.72,  for  air),  and  Re  =  poUoLo/po  is  the  Reynolds  number,  where  po,  Uq,  Lq,  and 
po  denote  the  characteristic  density,  velocity,  length,  and  diffusivity,  respectively.  The 
components  of  Cauchy  stress  tensor  are  related  to  the  velocity  via 
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where  p  denotes  the  normalized  viscosity. 

2.2.  Boundary  conditions.  We  are  interested  in  unsteady,  external  flows  around  an 
airfoil  as  pictured  in  Fig.l.  The  domain  boundary  is  F  =  F^,  U  Foo  and  the  far  field  velocity 
is  Uoo-  On  the  wall  boundary  r^,,  a  no-slip  condition  on  U  and  a  Dirichlet  condition  on  the 
temperature  T  are  imposed,  i.e., 

(2)  U  =  6,  and  T  =  Tw 


No  boundary  condition  is  specified  for  the  density.  In  the  far  fieW,  the  viscous  effect  is 
assumed  to  be  negligible,  therefore  a  uniform  free-stream  velocity  Uoo  is  imposed  on  Too 
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and  Poo  = 


where  a  is  the  angle  of  attack,  and  Moo  is  the  free-stream  Mach  number. 

2.3.  Discretization.  Let  the  temporal  variable  t  be  discretized  as  -f 

where  6t'‘  is  the  discrete  time  increment  and  t®  =  0.  We  consider  the  increment  SW^  = 
where  is  an  approximation  of  W(-,t^)  .  Note  that  when  an  algorithm  is 
written  in  the  “delta”  form  [3,  28],  the  increment  is  the  unknown  variable  rather  than 
W^.  Here,  we  use  a  first-order  finite  difference  approximation  for  the  temporal  variable, 
namely,  the  backward  Euler  scheme  given  as 

^  ■  Vi)  fw’- 
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Fig.  1.  The  computational  domain  Q  and  its  wall  and  far  field  boundaries. 
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where  and  72*“^  are  approximations  of  and  72(W'’(-,t*“^)),  respectively. 

We  determine  the  time  step  size  in  the  following  way.  Let  CFL  be  a  pre-selected  positive 
number.  For  each  element,  with  size  hi,  of  the  finite  element  mesh,  we  define  an  element 
time  step  size  by 


cA  L  _ _ 
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and  then  the  global  time  step  is  defined  by 

(5)  St'^  =  min{^tf }. 

i 

Here  Ci  is  the  element  sound  speed,  and  Ui  is  the  element  velocity  vector. 

The  computational  domain  is  discretized  by  a  triangular  grid  as  pictured  in  Fig.2.  We 
use  unstructured  grids  since  they  provide  flexibility  for  tessellating  about  complex  geome¬ 
tries  and  for  adapting  to  flow  features,  such  as  shocks  and  boundary  layers.  We  locate  the 
variables  at  the  vertices  of  the  grid,  which  gives  rise  to  a  cell-vertex  scheme.  The  space  of 
solutions  is  taken  to  be  the  space  of  piecewise  linear  continuous  functions.  The  discrete  sys¬ 
tem  is  obtained  via  a  mixed  Galerkin  finite  element /finite  volume  formulation;  see  Farhat 
et  al.  [12,  13],  and  Fezoui  and  Stoufflet  [15]  for  details.  In  short,  the  discretized  system  for 
(4)  is  obtained  as  follows: 

•  For  the  time  derivative  of  (4)  we  use  a  “mass-lumping”  technique,  in  which  we 
replace  the  mass  matrix  by  some  diagonal  matrix. 

•  For  the  convective  terms  of  the  left-hand  side  of  (4),  we  use  a  first  order  scheme 
that  is  an  extension  of  Van  Leer’s  MUSCL  [29]  scheme  to  the  case  of  unstructured 
grids  (see  Fezoui  and  Stoufflet  [15])  with  a  Roe  approximate  Riemann  solver  [24]. 

•  For  the  diffusive  terms  of  the  left-hand  side  of  (4),  we  use  a  Galerkin  finite  element 
(first-order  quadrature  integration). 

•  For  the  convective  terms  of  the  right-hand  side  of  (4),  we  use  a  second-order  scheme 
that  is  again  an  extension  of  Van  Leer’s  MUSCL  scheme  to  the  case  of  unstructured 
grids  (see  Fezoui  and  Stoufflet  [15])  with  a  Roe  approximate  Riemann  solver  [24]. 
We  also  use  Van  Albada’s  limiting  procedure  to  reduce  numerical  oscillations  of  the 
solutions. 

•  For  the  diffusive  terms  of  the  right-hand  side  of  (4),  we  use  a  regular  Galerkin  finite 
element  method  (second-order  quadrature  integration). 
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Fig.  2.  The  top  left  figure  shows  the  un-decomposed  nonuniform  shape  regular  finite  element  mesh.  The  top 
right  figure  shows  the  decomposition  of  the  mesh  into  non^ overlapping  subdomains,  and  the  bottom  figure  is 
a  blow-up  of  the  figure  around  the  airfoil. 


Although  both  approximations  we  use  for  the  left-hand  side  of  (4)  are  spatially  first- 
order,  they  operate  on  the  increment  and  as  a  consequence  the  resulting  scheme  is 
spatially  second-order  for  any  fixed  6t.  We  assume  satisfies  strongly  the  wall  boundary 
conditions  on  and  the  far  field  boundary  conditions  at  infinity.  For  the  initial  values, 
satisfies  strongly  the  wall  boundary  conditions  on  and  the  far  field  boundary  conditions 
at  Too-  For  the  initial  values  and  at  the  interior  nodes  of  fl,  W°  takes  the  free-stream 
boundary  condition. 

Putting  pieces  of  the  discretization  together,  at  each  time  step,  we  obtain  a  large,  sparse, 
generally  nonsymmetric  linear  system  of  equations  of  the  form 

(6)  +  = 


where  is  a  diagonal,  lumped  mass  matrix,  is  the  sum  of  the  convective  and  diffusive 
terms  on  the  left-hand  side  of  (4)  discretized  by  the  finite  volume  and  finite  element  methods, 
respectively,  is  the  discretized  right-hand  side  of  (4),  and  is  the  approximate  nodal 
value  of  SW'^  at  the  finite  element  mesh  points.  We  solve  (6)  by  a  preconditioned  Krylov 
iterative  method  with  a  preconditioner  to  a  certain  linear  tolerance  r,  i.e.. 


(7) 
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To  simplify  the  discussion,  we  shall  use  -|-  in  the  rest  of  the  paper.  The 

preconditioner  will  be  introduced  in  the  next  section. 

3.  Solution  methods  and  variable  degree  Schwarz  preconditioners.  In  this 
section,  we  first  briefly  recall  the  classical  additive  and  multiplicative  Schwarz  algorithms. 
Then  we  introduce  a  variable  degree  version  of  Schwarz  algorithms  that  is  more  suitable 
for  solving  time  dependent  problems.  Unlike  steady  state  problems,  if  certain  information 
or  matrix  factorization  obtained  at  the  previous  time  steps  can  be  used  a  great  deal  of 
calculations  can  be  saved. 

Let  us  review  the  Schwarz  algorithms.  Suppose  that,  at  time  step  A:,  we  need  to  solve 
a  linear  system  of  equations 


where  is  an  explicitly  constructed,  nonsymmetric  and  sparse  matrix  with  symmetric 
non-zero  pattern.  Since  our  main  interest  is  on  the  multicomponent  N.-S.  equations  in  two- 
dimensional  spaces,  each  element  of  can  be  considered  as  a  small  4x4  matrix,  and  each 
unknown  of  the  vector  is  a  ‘4-size’  vector.  Thus,  there  is  a  bijection  between  unknowns 
and  vertices.  We  denote  the  set  of  vertices  (or  nodes)  by  =  {1,  •  •  • ,  n},  where  n  represents 
the  total  number  of  nodes  (or  unknowns).  To  define  algebraic  Schwarz  algorithms,  see  e.g. 
[6],  we  first  partition  the  set  Af  into  no  nonoverlapping  subsets  {Afi}  whose  union  is  Af.  We 
use  the  TOP/DOMDEC  mesh  partitioning  package  of  Farhat  et  al.  [14]  to  obtain  sets  Afi. 
We  use  the  recursive  spectral  bisection  method  ([22])  with  certain  optimization  to  obtain 
roughly  the  same  number  of  interior  and  boundary  nodes  in  each  Afi,  and  also  to  obtain 
good  aspect  ratio  on  the  subgrids.  To  generate  an  overlapping  partitioning  with  overlap 
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ovlp,  we  further  expand  each  subgrid  Mi  by  ovlp  number  of  neighboring  nodes,  denoted  as 

Mi- 

We  denote  by  i,  the  vector  space  spanned  by  the  set  Mi-  For  each  subspace  i,,  we 
define  an  orthogonal  projection  operator  /,■  as  follows:  is  a  n  x  n  block  diagonal  matrix 
whose  elements  are  4  x  4  identity  matrices  if  the  corresponding  nodes  belong  to  Mi  and  to 
4x4  zero  matrices  otherwise.  With  this  we  define 

, 

which  is  an  extension  to  the  whole  subspace,  of  the  restriction  of  to  Li.  Note  that 
although  is  not  invertible  in  the  full  space,  its  restriction  to  the  subspace  spanned  by 
Mi  is  nonsingular,  and  we  define  its  inverse  by 

The  classical  additive  and  multiplicative  Schwarz  algorithms  can  now  be  simply  described 
as  follows:  Solve  the  linear  system 

MA^'^K  =  M/W 

by  a  Krylov  subspace  method,  where 

(8)  M  =  +  +  , 

for  the  additive  Schwarz  algorithm,  and 

(9)  M  =  /-(/-  (Af  ••■(/-  (aW)-^A(*'>) 

for  the  multiplicative  Schwarz  algorithm. 

It  has  been  shown  that  the  above  mentioned  algorithms  are  very  successful  for  steady 
state  CFD  problems,  see  e.g.,  [5,  21]  and  also  [9].  There  are  three  major  steps  in  the 
construction  of  the  Schwarz  preconditioners,  namely  1)  the  construction  of  the  matrix  A^^'F 
2)  the  construction  of  the  matrices  a|*^;  and  .3)  the  incomplete  factorization  of  the  matrices 
a\'^\  In  fact  Step  1)  is  not  necessary  since  the  matrices  constructed  in  Step  2)  can  be  used 
to  calculate  the  matrix-vector  multiplications.  Since  we  are  interested  in  implicit  methods. 
Step  2)  has  to  be  done  at  every  time  step  no  matter  how  expensive  it  is.  One  expensive  step 
in  the  construction  of  the  preconditions  as  formulated  above  for  time  dependent  problems 
is  Step  3).  One  way  to  avoid  the  frequent  factorization  of  Aj^^  is  to  simply  use  some 
old  factorized  matrix  A^^  calculated  at  time  step  j,  where  j  <  k.  However,  this  method 
may  not  be  very  effective  if  j  and  k  are  too  far  apart.  More  discussions  on  using  frozen 
preconditioners  can  be  found  later  in  the  paper. 

Another  problem  with  the  Schwarz  preconditioners  (8)  and  (9)  is  that  aU  subdomains 
are  treated  equally  in  terms  of  the  level  of  preconditioning  in  the  sense  that  the  number  of 
applications  of  or  its  inexact  version,  is  the  same  on  ail  subdomains,  regardless  of 

the  fact  that  the  subdomain  matrices  Ap^  have  vary  different  condition  numbers.  Physically 
speaking,  the  behavior  of  the  partial  differential  equations  in  subdomains  near  the  body  of 
the  airfoil,  or  near  the  shocks  is  very  different  from  the  regions  that  are  far  from  the 
subdomains  where  the  real  actions  take  place. 
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Here  we  propose  a  method  that  places  different  level  of  preconditioning  in  different 
subdomains  and  will  also  show  by  numerical  experiments  that  the  methods  remain  effective 
even  if  j  and  k  are  far  apart  from  each  other.  The  idea  is  simple.  We  replace  the  required 
matrix- vector  multiply  in  (8)  or  (9) 


(10)  w  = 

by  another  iterative  procedure  with  (5^  preconditioner.  Here  is  an  incom¬ 

plete  factorization  of  with  certain  levels  of  fill-ins.  More  precisely  speaking,  to  obtain 
w  for  a  given  v,  we  run  several  steps  of  GMRES  in  the  subspace  X,  to  drive  the  residual 


(11) 


We  then  set  w  :=  w.  Here  6  is  pre-selected  small  value.  Examples  can  be  found  in  §4  of 
this  paper.  In  the  matrix  language,  we  replace  the  matrix  (A)  in  (10)  by  a  matrix 
polynomial 


polyi  ((SpV'Af^) 


of  a  certain  degree,  which  depends  of  the  number  of  GMRES  iterations  needed  in  the 
subspace  i,.  To  put  them  into  a  single  form,  the  additive  Schwarz  preconditioner  becomes 

M  =  polyi  ((M^’V^aS''^)  +  ---  +  polyn,  ((5W)-^aW)  . 


Note  that  this  preconditioner  does  not  contain  (A^^^)  but  it  contains  certain  spectral 
information  of  (Ap^)“^  This  makes  it  very  effective.  In  fact,  M  is  a  truncated  series  rep¬ 
resentation  of  (a(*V^  based  on  a  splitting  of  A^*"^  into  the  sum  of  and  A^^  - 
A  discussion  on  a  related  polynomial  preconditioning  method  can  be  found  in  [17].  We 
note  that  in  a  given  subdomain,  the  number  of  GMRES  iterations,  or  the  degree  of  the 
polynomial,  is  determined  by  the  conditioning  of  the  local  stiffness  matrix.  The  multiplica¬ 
tive  version  can  be  constructed  in  a  similar  way.  The  extension  to  the  local  multiplicative 
Schwarz  method  ([7])  is  also  straightforward. 

We  remark  that  since  the  preconditioner  changes  in  the  GMRES  loop  due  to  the  stop¬ 
ping  condition  determined  by  it  is  generally  more  appropriate  to  use  the  so-caUed  flexible 
GMRES  [25],  which  is  slightly  more  expensive  than  the  regular  one.  We  do  not  use  the 
flexible  GMRES  since  the  regular  GMRES  presents  no  problem  for  our  test  cases.  The 
implementation  of  the  methods  is  rather  complicated  because  of  the  use  of  multi-layered 
Krylov  iterations  as  a  global,  or  outer,  and  also  local  solvers.  PETSc  makes  our  numerical 
tests  possible.  More  details  regarding  to  the  implementation  will  be  discussed  in  the  next 
section. 

4.  Numerical  results.  The  goal  of  this  section  is  to  demonstrate  the  usefulness  of  the 
family  of  VDS  preconditioners  in  the  implicit  solution  of  compressible  flow  problems,  and  to 
compare  the  efiectiveness  of  the  methods  with  various  other  methods,  such  as  the  pointwise 
Jacobi  iterative  method  and  the  global  ILU(O)  preconditioned  GMRES  method,  for  both, 
subsonic  and  transonic  flows.  The  experiments  were  performed  on  a  DEC  Alpha  workstation 
(275MHz,  512MB  memory),  and  the  software  was  written  by  using  the  newly  developed 
package  PETSc  [19]  of  the  Argonne  National  Laboratory.  All  arithmetic  operations  are  in 
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double  precision.  The  system  BIAS  library  (dxml  [10])  was  used.  Only  sequential  results 
are  reported  here.  A  parallel  version  of  the  code  is  being  developed,  and  the  results  will  be 
reported  in  the  future.  Here  we  apply  our  computational  algorithms  to  the  simulation  of 
two-dimensional  low  Reynolds  number  chaotic  flows  past  a  NACA0012  airfoil  at  high  angle 
of  attack  and  two  different  Mach  numbers.  It  was  shown  in  Pulliam  [23]  that  such  flows 
can  be  resolved  with  a  reasonable  number  of  grid  points.  The  accuracy  of  the  computed 
solutions  are  substantiated  by  successive  mesh  refinements  and  comparisons  with  the  results 
were  reported  in  [2.3].  No  steady  state  solutions  exists  for  both  test  cases  described  below. 

Test  1:  The  subsonic  case  with  free  stream  Mach  number  Moo  =  0.1  and  Re  =  800.0. 
We  use  a  pre-generated  shape  regular  triangular  mesh  with  12280  nodes;  see  Fig.2  for 
example.  The  Mach  surfaces  of  the  computed  solution  at  various  time  steps  are  given  in 
Fig.5. 

Test  2:  The  transonic  case  with  free  stream  Mach  number  Moo  =  0.84  and  Re  =  1600.0. 
We  use  a  mesh  with  48792  nodes  obtained  by  uniformly  refining  the  mesh  used  in  Test  1. 
The  Mach  surfaces  of  the  computed  solution  at  various  time  steps  are  given  in  Fig. 6. 

Table  1 

Total  CPU  hours  and  time  steps  spent  for  calculating  the  lift  curves  using  explicit  and  implicit  methods  with 
different  CFL  numbers.  The  total  non-dimensionalized  time  interval  is  (0,  25)  for  the  Moo  =  0.1  case  and 
(0,  10)  for  the  Mcc  =  0.84  case. 


Expl. 

CFL=0.8 

Impl. 

CFL=25 

Impl. 

CFL=50 

Impl. 

CFL=100 

Moo  =  0.1 

CPU(hours) 

54.75 

22.84 

13.26 

7.58 

Meshl2k 

Time  steps 

196091 

6018 

3009 

1504 

Moo  =  0.84 

CPU(hours) 

51.89 

14.26 

9.20 

6.06 

Mesh48k 

Time  steps 

48216 

1389 

694 

.347 

In  the  following  discussions,  we  shall  refer  to  these  two  meshes  as  “Meshl2k”  and 
“Mesh48k”,  respectively.  In  the  implementation  of  Schwarz  preconditioners,  we  partition 
the  mesh  by  using  the  TOP/DOMDEC  package  [14],  which  implements  the  recursive  spec¬ 
tral  bisection  method.  We  require  that  all  subdomains  have  more  or  less  the  same  number 
of  mesh  points.  An  effort  is  made  to  reduce  the  number  of  mesh  points  along  the  interfaces 
of  subdomains,  which  may  be  needed  later  in  our  parallel  code  to  reduce  the  communica¬ 
tion  cost.  The  mesh  generation  and  partitioning  are  considered  as  pre-processing  steps,  and 
therefore  not  counted  toward  the  CPU  time  reported  in  Table  1.  The  sparse  matrix  defined 
by  (4)  is  constructed  at  every  time  step,  and  stored  in  the  Compressed  Sparse  Row  format. 
The  subdomain  matrices  are  obtained  by  taking  elements,  according  to  a  pre-selected  index 
set,  from  the  global  matrix.  A  symbolic  ILU(O)  factorization  of  the  subdomain  matrix  is 
performed  at  the  very  first  time  step,  and  re-used  at  all  the  later  time  steps.  This  is  possible 
due  to  the  fact  that  the  matrices,  constructed  at  every  time  step,  share  the  same  non-zero 
pattern.  We  also  tested  the  ILU(A;)  (k  >  0)  preconditioners,  which  are  not  competitive  with 
ILU(O)  in  terms  of  the  CPU  time  in  our  implementation  for  both  test  cases.  We  remark  that 
if  ILU  with  drop  tolerance  is  used  then  the  non-zero  pattern  of  the  matrices  may  change 
and  therefore  the  symbolic  factorization  may  not  be  very  useful. 

We  note  that  at  the  beginning  of  the  flow  movement,  i.e.,  when  the  non-dimensional- 
ized  time  t  <  1.0,  the  flow  changes  so  drastically  that  the  use  of  any  that  makes  the 
corresponding  CFL  number  larger  than  1.0  would  result  in  the  loss  of  time  accuracy  for  the 


9 


Fig.  3.  Lift  curves  obtained  by  explicit  and  implicit  methods.  The  left  figure  shows  the  subsonic  cases  with 
Moo  0.1,  Re  =  800.0  and  the  number  of  mesh  points  is  12280.  The  right  figure  is  for  the  transonic  cases 
with  Moo  =  0.84,  Re  =  1600.0  and  the  number  of  mesh  points  is  48792. 


entire  calculation.  This  implies  that  small  Sf^  have  to  be  used  when  t  <  1.0,  and  therefore, 
the  implicit  method  has  to  be  abandoned  for  this  initial  period  of  time.  In  our  experiments, 
the  implicit  solver  is  turned  on  at  /  =  1.0.  The  solution  for  the  period  0  <  t  <  1.0  is 
obtained  with  the  explicit  method  with  CFL=0.8. 

4.1.  The  effect  of  using  large  CFL  numbers.  One  of  the  biggest  advantages  of 
implicit  methods  is  that  large  time  steps,  or  large  CFL  numbers,  can  be  used.  We  first 
examine  this  claim  by  comparing  the  time  accuracy  of  two  solutions  obtained  by  using  the 
implicit  methods  with  CFL=25,  50  and  100,  respectively,  and  the  solution  obtained  by  a 
second  order  (in  time  and  space)  explicit  method,  [12].  From  Fig.3,  one  can  easily  see  that 
no  two  different  CFL  numbers  give  identical  solutions.  However,  the  important  thing  for 
engineering  purposes  is  to  capture  correctly  the  “phase”  and  “amplitude”  of  the  lifts.  Small 
errors  in  amplitude  and  phases  are  usually  admissible.  This  can  often  substantially  shorten 
the  turnaround  times  in  the  initial  aerodynamic  design  and  analysis. 

For  the  implicit  methods,  we  use  the  multiplicative  VDS  preconditioned  GMRES  method 
as  the  global  linear  solver.  The  subdomain  preconditioners  are  re-computed  at  every  5  time 
steps.  In  the  Schwarz  methods,  we  use  8  subdomains,  and  ovip  =  1.  Each  subdomain  prob¬ 
lem  is  solved  by  an  ILU(O)  preconditioned  GMRES  method  with  the  stopping  tolerance  set 
to  ^  =  10“^  The  stopping  tolerance  for  the  global  linear  system  solver  is  r  =  10“®.  We 
also  test  several  cases  with  smaller  r,  such  as  10"'*  and  10“®.  The  resulting  lift  curves  are 
not  distinguishable  from  the  ones  shown  in  Fig.3. 

In  Table  1,  we  report  the  total  number  of  time  steps  and  CPU  time  in  hours  spent  on 
the  entire  computation,  not  including  the  mesh  generation  and  partitioning.  We  observe 
from  our  experiments  that  even  with  a  CFL  number,  100,  not  much  time  accuracy  is  lost 
for  a  certain  period  of  time,  see  Fig.3.  However,  if  the  time  accuracy  is  required  for  a  longer 
period  of  time,  we  do  recommend  a  smaller  CFL  number.  We  remark  that  our  discussions 
here  on  the  effects  on  using  large  CFL  numbers  is  based  on  our  first  order  time  discretization 
scheme.  The  results  may  change  slightly  if  higher  order  schemes  are  used. 


4.2.  The  Schwarz  parameters.  The  number  of  subdomains  and  the  size  of  overlap 
are  two  important  parameters  for  Schwarz  methods.  We  here  test  the  additive  and  mul¬ 
tiplicative  VDS  methods  for  both  test  cases.  Instead  of  running  the  entire  calculation  as 
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we  did  in  the  previous  section,  we  run  the  tests  for  only  100  time  steps  starting  at  t  =  1.0. 
In  terms  of  the  non-dimensional  time,  this  means  time  intervals  (1,1.86)  for  Test  1  and 
(1,2.28)  for  Test  2. 

In  the  rest  of  the  paper,  we  shall  use  Maxit  to  denote  the  maximum  number  of  global 
GMRES  iterations  and  Totallt  the  total  numbers  of  global  GMRES  iterations  within  this 
100  linear  system  solves.  To  measure  the  approximate  cost  of  the  methods  without  includ¬ 
ing  any  machine  dependent  factors,  we  use  EMatVec  to  denote  the  equivalent  number 
of  matrix-vector  multiplications,  which  includes  the  actual  stiffness  matrix-vector  multipli¬ 
cations  and  the  preconditioning-matrix-vector  multiplications.  Since  different  subdomains 
may  need  different  number  of  matrix- vector  multiplications,  we  take  the  average  over  all 
subdomains  and  convert  it  into  a  multiple  of  the  equivalent  global  matrix-vector  multipli¬ 
cations.  Suppose  that  h  is  the  size  of  the  global  matrix.  Note  that  n  is  4  times  the  number 
of  mesh  points.  Then,  one  global  matrix-vector  multiplication  requires  roughly  28ra  flops. 
Our  primary  iterative  solver  GMRES  has  a  complexity  of  /(/  +  2)h  + 1 x(cost  of  a  precon¬ 
ditioned  matrix-vector  multiplication).  Here  I  is  the  number  of  iterations.  For  example, 
the  pure  GMRES  cost,  without  counting  the  cost  of  the  matrix- vector  multiplications,  for 
4  iterations  is  about  24n,  which  is  a  little  less  than  the  cost  of  doing  1  global  stiffness 
matrix- vector  multiplication. 

Let  us  first  discuss  the  dependence  of  the  convergence  rate  on  the  number  of  subdo¬ 
mains.  We  use  5  different  decompositions  of  with  both  Meshl2k  and  Mesh48k.  The 
number  of  subdomains  goes  from  8  to  128.  We  run  both  Test  1  and  2,  with  ovlp  equals  to 
one  fine  mesh  cell.  In  Table  2,  we  present  the  maximum  number  of  global  GMRES  iterations 
within  one  hundred  time  steps  and  its  corresponding  EMatVec.  If  multiplicative  VDS  is 
used  even  without  the  special  subdomain  coloring  or  ordering,  Maxit  is  independent  of  the 
number  of  subdomains,  for  reasonably  large  number  of  subdomains,  such  as  128.  An  inter¬ 
esting  case  is  shown  on  the  top  left  portion  of  Table  2,  which  indicates  that  if  additive  VDS 
is  used  for  the  subsonic  problem,  the  number  of  maximum  iterations  does  grow,  though  not 
very  fast,  as  the  number  of  subdomains  becomes  large.  In  this  case,  we  believe  that  a  coarse 
space  may  be  useful  to  reduce  the  dependence  on  the  number  of  subdomains.  However,  we 
have  not  implemented  the  coarse  grid  solver  yet.  For  transonic  problems,  our  tests  show 
that  the  use  of  a  coarse  level  grid  is  not  necessary  with  both  additive  and  multiplicative 
VDS  preconditioners. 

Whether  overlap  is  useful  or  not  is  a  rather  subtle  issue.  It  depends  on  the  global  linear 
stopping  parameter  r  defined  in  (7)  and  the  local  linear  stopping  parameter  6  defined  in 
(11).  In  Table  3,  we  report  the  case  for  6  =  10~®  and  varying  r.  According  to  the  results  in 
Table  3  and  a  large  number  of  other  tests  we  did  (not  being  reported  here),  large  oyerlaps 
can  reduce  the  number  of  iterations  and  CPU  time  only  if  the  stopping  parameter  6  is  small. 
In  our  situation  when  r  =  10“^,  we  find  6  =  10“^  offers  the  best  CPU  results,  and  therefore 
we  do  not  need  large  overlaps.  In  the  rest  of  the  tests,  we  shall  use  this  set  of  r  and  6,  with 
1  overlap. 

We  next  exam  the  VDS  methods.  We  focus  on  the  case  with  8  subdomains,  and  use 
GMRES/ILU(0)  as  the  local  subdomain  solvers.  The  partitionings  used  for  Meshl2k  and 
Mesh48k  are  different.  The  subdomains  are  numbered  as  in  Fig.4.  The  results  obtained 
for  one  hundreds  time  steps  starting  at  t  =  1.0  are  summerized  in  Table  4.  We  have  also 
run  the  test  for  t  equals  to  other  values  and  the  results  are  more  or  less  the  same.  We 
use  the  same  local  stopping  condition,  namely  6  =  10~^  for  all  subdomains  and  for  both 
subsonic  and  transonic  problems.  It  turns  out  the  required  degrees  of  local  preconditioning 
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Fig.  4.  The  left  figure  shows  the  partitioning  of  Meshl2k  into  8  subdomains  and  the  right  ones  shows  that 
for  Mesh48k, 

polynomials  are  quite  different.  For  the  subsonic  case,  subdomains  Qq  and  fis  need  more 
iterations  (4  and  6  respectively)  than  other  subdomains.  The  left  picture  of  Fig.4  shows 
that  these  two  subdomains  cover  the  top  portion  of  the  airfoil.  Only  two  iterations  are 
needed  for  subdomains  that  are  far  away  from  the  airfoil,  such  as  fii,  SI2  and  fla.  The 
number  of  iterations  reflects  the  conditioning  of  the  subdomain  matrix.  For  the  transonic 
case,  all  subdomains  need  either  one  or  two  more  iterations. 

For  both  test  problems.  Table  2  and  Table  4  show  that  both  the  number  of  global 
iterations  and  the  number  of  local  iterations  are  surprisingly  small,  which  indicate  that  the 
linear  system  of  equations  (6)  is  in  fact  not  too  ill-conditioned.  We  believe  that  this  is  due 
to  the  use  of  relatively  small  time  steps  (5). 

4.3.  A  comparison  with  the  pointwise  Jacobi  iterative  method.  For  compari¬ 
son  purpose,  we  solve  both  test  problems  by  using  the  simplest  iterative  method,  namely  the 
pointwise  Jacobi  method,  which  is  often  referred  to  as  the  Jacobi  preconditioned  Richard¬ 
son’s  method.  Jacobi  method  has  a  few  very  attractive  features,  such  as  being  easy  to 
parallelize.  Note  that  for  multicomponent  test  problems,  a  point  corresponds  to  a  4  x  4 
matrix.  When  using  the  Jacobi  method,  a  dense  4x4  matrix  needs  to  be  inverted  at 
every  mesh  point.  In  our  experiments,  at  each  time  step  before  solving  the  linear  system, 
we  compute  the  inverse  of  these  4x4  matrices  explicitly  and  save  them  for  the  Jacobian 
iterations.  As  before,  we  run  the  tests  for  one  hundred  time  steps  and  record  the  maximum 
number  of  iterations  as  well  as  the  total  number  of  iterations,  see  Table  5.  We  observe  that, 
in  terms  of  iteration  numbers  for  solving  linear  systems,  the  transonic  problem  is  easier 
to  handle  than  the  subsonic  problem,  which  is  more  of  an  elliptic  system.  Comparing  the 
Totallt,  which  equals  to  the  total  number  of  EMatVec,  in  Table  5,  and  the  total  EMatVec 
numbers  in  Table  2,  we  found  that  Jacobi  is  considerably  more  expensive  than  the  Schwarz 
preconditioned  GMRES  methods.  We  believe  that  the  required  number  of  iterations  would 
grow  much  faster  if  finer  meshes  are  used  than  that  of  the  Schwarz  preconditioned  GMRES 
methods. 
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4.4.  A  comparison  with  a  global  ILU(O)  preconditioned  GMRES  method.  In 
Table  5,  we  show  the  maximum  and  total  number  of  iterations  when  using  the  global  ILU(O) 
preconditioned  GMRES  method  for  both  test  cases.  In  terms  of  the  number  of  EMatVec, 
the  method  outperforms,  by  a  factor  of  5%  to  30%(Comparing  Table  5  and  the  bottom  part 
of  Table  2),  the  multiplicative  VDS  preconditioned  GMRES  methods  we  discussed  in  the 
paper.  Unfortunately,  its  parallelization  on  distributed  memory  computers  is  not  very  easy. 

4.5.  The  effect  of  frozen  preconditioner.  Finally,  we  examined  the  effect  of  using 
the  same  preconditioner,  or  part  of  the  preconditioner,  for  several  time  steps  without  doing 
the  factorization  at  every  time  step.  In  Table  6,  we  summerize  the  results  for  using  different 
numbers  of  frozen  steps,  namely  Froz  =  1, 5, ....  There  is  a  range  of  optimal  Froz  one  can 
choose  from;  similar  numbers  of  EMatVec  were  obtained  in  our  implementation  for  Froz 
ranging  from  5  to  50.  For  the  subsonic  case,  we  can  go  a  bit  further,  e.g.,  take  Froz  =  100. 

5.  Conclusions.  We  proposed  and  tested  a  family  of  variable  degree  Schwarz  (VDS) 
preconditioned  GMRES  methods  for  solving  linear  systems  that  arise  from  the  discretization 
of  unsteady,  compressible  N.-S  equations  on  2D  unstructured  meshes  for  both  subsonic 
and  transonic  flows  past  a  single  element  NACA0012  airfoil.  We  found  that  with  implicit 
methods,  larger  time  steps  can  be  used  and  the  overall  simulation  time  can  be  reduced 
significantly  comparing  with  the  explicit  method. 

In  VDS,  the  level  of  preconditioning  in  each  subdomain  varies  according  to  the  local  flow 
condition,  therefore  extra  preconditioning  is  performed  only  when  and  where  it  is  needed. 
For  subsonic  problems,  we  found  that  the  conditioning  of  the  subdomain  matrices  changes 
quite  a  bit  from  one  flow  region  to  another,  and  extra  local  preconditioning  in  subdomains 
in  which  the  flow  changes  drastically  can  significantly  reduce  the  total  number  of  global 
linear  iterations.  This  is  somewhat  less  obvious  for  transonic  flow,  which  needs  a  nearly 
uniformly  small  global  and  local  number  of  iterations. 

When  using  VDS,  the  best  results  are  obtained  with  small  overlap.  For  the  multiplica¬ 
tive  version,  the  convergence  rate  depends  very  mildly  on  the  number  of  subdomains  (up 
to  128  subdomains  has  been  tested),  and  for  the  additive  version,  a  slight  dependence  is 
observed  for  the  subsonic  test  problem  and  therefore  a  coarse  space  might  be  useful.  We 
also  compared  our  methods  with  the  simple  point(means  4  x4  block  for  our  multicomponent 
problems)  Jacobi  iterative  method  and  the  global  ILU(O)  preconditioned  GMRES  method. 
We  found  that  Jacobi  is  significantly  slower  than  the  proposed  methods,  especially  for  the 
subsonic  case,  and  if  sequential  computers  are  the  primary  computing  platform,  then  the 
global  ILU(O)  preconditioned  GMRES  is  a  winner  over  aU  methods  we  have  tested. 

AU  of  our  tests  were  done  on  a  sequential  computer.  Considerable  effort  is  needed  in 
order  to  obtain  a  weU-balanced  parallel  implementation.  We  remark  that  our  mesh  parti¬ 
tioning  is  obtained  without  the  knowledge  of  the  flow  condition  and  our  experiences  show 
that  a  solution  dependent  mesh  partitioning  would  offer  a  more  computationally  balanced 
decomposition  of  the  problems. 
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Fig.  6.  The  Mach  distribution  at  Mcc  —  0.84,  for  non-dimensionalized  time  t 
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Table  2 

Each  test  is  for  100  time  steps  and  at  each  time  step  the  initial  preconditioned  residual  is  reduced  by  a  factor 
of  T  =  10“^  by  using  GMRES/VDS  with  ovlp  =  1.  We  use  GMRES/ILU(0)  as  inexact  local  solvers  to 
reduce  the  local  preconditioned  residual  by  a  factor  of  6  =  10“^ .  Here  CFL=50. 


ASM 

Test  1 

Test  2 

^  subdomains 

Maxit 

Totallt 

EMatVec 

Maxit 

Totallt 

EMatVec 

8 

6 

545 

3150 

6 

519 

1471 

16 

7 

585 

3529 

6 

506 

1628 

32 

9 

673 

3851 

.  6 

560 

1864 

64 

10 

756 

4223 

6 

600 

2184 

128 

11 

842 

5621 

7 

603 

2192 

MSM 

Test  1 

Test  2 

8 

4 

292 

1613 

3 

300 

832 

16 

4 

316 

1812 

3 

300 

915 

32 

4 

320 

1834 

3 

300 

994 

64 

4 

344 

1900 

3 

300 

1089 

128 

4 

351 

2335 

3 

300 

1084 

Table  3 

GMRES  iteration  numbers  to  reduce  the  preconditioned  residual  of  Test  1  to  r  =  10”®  using  GM- 
RES / (additive  VDS)  with  8  subdomains.  We  use  GMRES/ILU(0)  as  inexact  local  solver  with  different 
local  stopping  criteria.  Here  CFL=100. 


ovlp  =  0 

ovlp  =  1 

ovlp  =  2 

ovlp  =  3 

1  iteration 

35 

43 

46 

48 

S  =  5.0e-^ 

23 

32 

33 

33 

S  =  3.0e-i 

18 

20 

,19 

19 

6  =  l.Oe-^ 

16 

15 

14 

14 

15 

12 

11 

exact 

15 

11 

10 

9 

Table  4 

The  maximum  (Maxit)  and  total  (Totallt)  local  GMRES/ILU(0)  iteration  numbers.  The  global  solver  is 
GMRES/ (multiplicative  VDS).  The  parameters  are  r  =  10"®,  S  =  10”^,  ovlp  =  1  and  the  local  solvers  are 
ILU(0).  Here  CFL=50. 


O3 

wsm 

Test  1,  Maxit 

2 

2 

2 

3 

3 

5 

Totallt 

rail 

225 

291 

421 

Test  2,  Maxit 
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2 

2 

2 

1 
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Totallt 

127 

200 

132 

185 

150 

100 
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Table  5 

The  number  of  equivalent  EMatVec  operations  needed  for  100  time  steps  starting  aM  =  1.0.  r  =  10"^ 
6  ==  CFL=50. 


4x4  Jacobi 

Global  ILU(O) 

Test  1,  Maxit 

74 

10 

EMatVec 

7299 

1472 

Test  2,  Maxit 

46 

4 

EMatVec 

4222 

800 

Table  6 

The  number  of  EMatVec  operations  needed  for  100  time  steps  starting  att=  1.0.  In  GMRES /  (multiplicative 
VDS),  T  =  10“^,  6  =  10”S  CFL=50,  number  of  subdomains  is  8  and  ovlp  =  1.  For  the  Froz=200  case^  the 
numbers  are  taken  for  200  time  steps  divided  by  2. 


Froz= 

1 

5 

10 

50 

100 

200 

Test  1,  EMatVec 

1613 

1611 

1615 

1618 

1629 

1875 

Totallt 

292 

WtiVM 

MAM 

290 

291 

321 

Test  2,  EMatVec 

832 

842 

868 

1243 

TotaUt 

300 

300 

300 

305 

315 

469 
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